ASD-SWNet: a novel shared-weight feature extraction and classification network for autism spectrum disorder diagnosis

The traditional diagnostic process for autism spectrum disorder (ASD) is subjective, where early and accurate diagnosis significantly affects treatment outcomes and life quality. Thus, improving ASD diagnostic methods is critical. This paper proposes ASD-SWNet, a new shared-weight feature extraction and classification network. It resolves the issue found in previous studies of inefficiently integrating unsupervised and supervised learning, thereby enhancing diagnostic precision. The approach utilizes functional magnetic resonance imaging to improve diagnostic accuracy, featuring an autoencoder (AE) with Gaussian noise for robust feature extraction and a tailored convolutional neural network (CNN) for classification. The shared-weight mechanism utilizes features learned by the AE to initialize the convolutional layer weights of the CNN, thereby integrating AE and CNN for joint training. A novel data augmentation strategy for time-series medical data is also introduced, tackling the problem of small sample sizes. Tested on the ABIDE-I dataset through nested ten-fold cross-validation, the method achieved an accuracy of 76.52% and an AUC of 0.81. This approach surpasses existing methods, showing significant enhancements in diagnostic accuracy and robustness. The contribution of this paper lies not only in proposing new methods for ASD diagnosis but also in offering new approaches for other neurological brain diseases.


Datasets and data preprocessing
All research in this manuscript complies with all relevant ethical regulations.The online available ABIDE dataset is utilized in this study.Ethical approval was not required as confirmed by the license attached with the openaccess data, since they were previously approved by each site's local IRB.All data were obtained with informed consent from the subjects.To gain a deeper understanding of the neural mechanisms underlying ASD, the ABIDE dataset 28 compiles brain MRI data from ASD patients and healthy developing children (HC).The dataset and preprocessed information used in this study are shown in Table 1.
As shown in Table 1, we preprocess the ABIDE data using the configurable pipeline for the analysis of connectomes (CPAC) 29 .Hence, we preprocess the ABIDE-I data using CPAC, discarding samples with anomalies or missing time series.We ultimately utilize data from 871 subjects in ABIDE-I, including 419 ASD individuals and 452 HC.The data processing assistant for resting-state fMRI (DPARSF) 30 is employed for preprocessing the

Improved data augmentation algorithm
The performance of the model is significantly influenced by both the size and the quality of the dataset.To address the challenges of limited and imbalanced data samples, this study improved the data augmentation algorithm proposed by Eslami et al. 32 .
The inspiration for the data augmentation algorithm comes from the synthetic minority oversampling technique 33,34 .This algorithm creates new data in the feature space by utilizing the nearest neighbor of the sample.First, find K-nearest neighbor x1, x2, ..., xk of sample x .Then, we randomly select a sample xi from the K-nearest neighbor, and use Eq.(3) to generate a new feature vector x ′ : where ϕ is the threshold value, which is a random number between [0,1].Generally, the K-nearest neighbor of sample x is calculated based on Euclidean distance.However, the Euclidean distance can only reflect the overall relationship between sequences, rather than the local changes 35 .To find the K-nearest neighbor, we introduce principal component analysis (PCA) 36 of the extended Frobenius norm (EROS) 37 .
First, estimations are made for the covariance matrices of two multivariate time series (MTS), and then their eigenvalues and eigenvectors are calculated.Finally, the similarity of each MTS element is measured by the eigenvalues obtained in MTS.The EROS distance between MTS A and B is calculated, as shown in Eq. ( 4): where ω is the weight vector matrix.The singular value decomposition is performed on the covariance matri- ces of A and B to obtain the eigenvectors VA = [a1, a2, ..., an] and VB = [b1, b2, ..., bn] .In Eq. ( 4), ai and bi are orthogonal column vectors with length n.Equation ( 4) can be further simplified into Eq.( 5): The key operation of PCA is to simplify the original data.For MTS A and B, the principal elements of each matrix are obtained first, and then z principal elements are selected.The similarity among z principal elements is calculated by Eq. ( 6): where θij is the angle between the ith pivot of A and the jth pivot of B. Then, the similarity of EROS of A and B is defined as Eq. ( 7): (3) where θi is the angle between ai and bi .ω is the weighted vector of the eigenvalues of the MTS dataset.First, the eigenvalues in each MTS are normalized, and subsequently, all the eigenvalues in the whole dataset are normalized by the aggregation function, such that The dimension of the covariance matrix for each sample is n × n , where n is the number of ROIs of the atlas.The covariance matrix of each dataset is calculated in advance.Research conducted by Eslami et al. 38 on ADHD indicated that EROS effectively measures similarity in fMRI data.PCA of EROS can be applied to determine the K-nearest neighbor distance.In our approach, after identifying the K-nearest neighbor for each sample in the training set and randomly choosing one, we utilize linear interpolation between the selected sample and its nearest neighbor to produce new samples.This algorithm is used to enhance the data.A composite sample is created for each data in the dataset and placed into the dataset.By setting different augmentation factor γ , the size of the dataset is increased to γ times the original.The pseudocode of our proposed data augmentation algorithm is shown in Algorithm 1.

Unsupervised learning feature extraction method
Autoencoders, as an unsupervised learning method 39 , aim to reduce dimensionality and extract features by learning compressed representations of data.Their fundamental design includes an encoder, a bottleneck layer, and a decoder.The encoder transforms the input data into a lower-dimensional latent space, as depicted by the bottleneck layer.Subsequently, the decoder reverts this latent representation to the original data space, aiming to reduce the reconstruction error.The denoising autoencoder (DAE) is a variation of the standard autoencoder, designed to learn robust representations by introducing noise into the input data.The DAE undergoes training to reduce the discrepancy between the input data with added noise and the decoder's output, compelling the model to acquire feature representations that are resilient to noise.This training process enables the DAE to extract features that are resilient to variations in the data, enhancing the model's robustness and generalizability.In our study, the DAE is utilized for low-level feature extraction from data and also to reduce the feature dimensions to a manageable range.
There are many ways to set noise in DAEs.To retain more information from the original data and simulate random errors in the data, in this study, we add Gaussian noise to the preprocessed data x to obtain a new data representation x ′ .Eq. (8) shows how Gaussian noise is added to the original data: where ε signifies random noise drawn from a Gaussian distribution, characterized by a mean of 0 and a stand- ard deviation of σ .The mean squared error loss function is utilized to determine the error between the original input x and the reconstructed input y , with the aim of training the autoencoder to minimize this reconstruction error.In the case of the CC200 atlas, the encoder and decoder were configured with 19,900 nodes each, and the bottleneck layer was allocated 2000 nodes, facilitating a reduction in dimensionality.Our DAE model designed for unsupervised feature extraction is shown in Fig. 1.

Supervised learning with convolutional neural networks
Convolutional neural networks can extract high-level features from data and classify samples.However, the latest network models generally have many layers, which is not suitable for the characteristics of the ASD dataset.Therefore, we designed a convolutional neural network model for the automatic diagnosis of ASD, which involves convolutional layers, batch normalization (BN), activation function layers, max pooling, dropout, fully connected layers, and a sigmoid activation function.Our proposed classification model framework is shown in Fig. 2.
CNN extracts a set of high-level features from the low-level features generated after DAE feature extraction.Convolutional layers apply convolution operations to feature maps through kernels, automatically learning and extracting features from the input data.Batch Normalization 40 is typically applied after convolution operations.BN enhances the training stability of the network, accelerates convergence, and improves model generalizability.However, BN requires the computation of mean and variance in each batch and introduces scaling and shifting parameters, increasing computational overhead and network complexity.Therefore, in our designed CNN model, BN layers were not added in the middle three convolution operations.
Max pooling layers are a key component of our designed CNN network model, with one of their primary functions being feature dimensionality reduction.In addition, max pooling layers select the most prominent feature values in each region, extracting important information from the original data, and enhancing the model's generalization ability and noise resistance.Moreover, max pooling layers contribute to reducing the risk of overfitting by simplifying the model through dimensionality reduction, particularly vital for tasks with limited medical imaging data.
Dropout layers randomly set the output of a portion of neurons to zero during training, serving as an effective regularization technique.The primary goal of Dropout is to mitigate the risk of overfitting, forcing the model not to rely on any single neuron.It helps reduce the coupling between neurons, thereby enhancing the model's robustness.
The Sigmoid function 41 serves as the activation function for the output layer, facilitating binary classification and mapping the network's output to values indicative of probabilities.

Shared weights network framework
We have constructed the feature extraction method for unsupervised learning and the convolutional neural network for supervised learning.Next, we implement weight sharing between AE and CNN.Specifically, we construct an autoencoder (Fig. 1) and train it on input data in an unsupervised manner to minimize reconstruction error so that the decoder output is as close as possible to the input data.The training process uses standard backpropagation algorithms and mean squared error loss functions.The feature representation extracted from the bottleneck layer of AE is used for the subsequent CNN input.We build a convolutional neural network model (Fig. 2) and connect the entire model (comprising convolutional layers, fully connected layers, and the bottleneck layer of the autoencoder) to form an end-to-end model.Once AE training is complete, we initialize the weights of its bottleneck layer as part of the CNN, thus achieving weight sharing.Meaning, that the feature representation acquired by the AE is utilized to initialize the weights of the convolutional layers in the CNN.This is intended to align CNN's initial feature representation more closely with the useful characteristics of the data.Subsequently, the low-dimensional feature representation obtained from the AE bottleneck layer is input into the model to start joint training of the CNN model, including the shared weight AE and additional CNN layers.Throughout the training phase, the primary objective of the model is to minimize the cross-entropy classification loss function.We use backpropagation algorithms to update the weights of the shared weight autoencoder part and CNN layers to reduce classification loss.Through backpropagation and gradient descent optimization algorithms, we update the entire model's weights to achieve optimal performance in classification tasks.Our proposed ASD-SWNet method for diagnosing autism spectrum disorder with shared weights is shown in Fig. 3.

Experiment settings
The experiments of this study are primarily conducted on the ABIDE-I dataset using nested ten-fold crossvalidation and on the ABIDE-II dataset using leave-one-out cross-validation.Nested ten-fold cross-validation is executed on two levels to overcome bias in model selection and performance evaluation.ASD-SWNet hyperparameter settings and CNN configurations are shown in Tables 2 and 3.All models were developed using the open-source machine learning library PyTorch and experiments were conducted on a GeForce GTX 4060 GPU.For performance evaluation 42 , accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC) were used as metrics.

Effect of improved date augmentation on results
To verify the impact of the data augmentation algorithm on model performance, we performed nested ten-fold crossover on ASD-SWNet under the conditions of data augmentation factor γ = 1 (no data augmentation), γ = 2 , γ = 3 , and γ = 4 respectively.verify.Figure 4 shows the change in model performance with increasing levels of data augmentation.With a data augmentation factor of γ = 1 , the model's accuracy reached 75.37%, with an AUC of 0.80.With- out the use of data augmentation, our model demonstrated high performance.At γ = 2 , the model's accuracy reached 76.52%, with an AUC of 0.81, indicating a high overall performance.As we continued to enhance the dataset, at γ = 3 and γ = 4 , the model's accuracy, precision, and recall all improved.Compared to methods  not utilizing data augmentation, at γ = 4 , the model's accuracy improved by 1.31%, precision by 0.39%, recall by 1.44%, and AUC by 1%.The significant enhancement in model performance adequately demonstrates the effectiveness of our proposed data augmentation algorithm.From Fig. 4, it can be observed that the greatest improvement in model performance occurs at γ = 2 .As the data augmentation factor increases, the extent of improvement in model performance becomes limited.At γ = 3 , the model's accuracy improved by 0.13%.At γ = 4 , the accuracy increase was merely 0.03%.Such marginal growth is difficult to classify as a breakthrough in model performance.This may be attributed to excessive enhancement and increased model complexity.Overutilization of data augmentation may lead to model overfitting, where excessive data variability causes the model to learn noise rather than meaningful patterns.Considering the running time, we noted that with the increase in the data augmentation factor, the running time showed a trend of nearly proportional growth.This is not conducive to the model's training.Therefore, combining all performance metrics and running time, we believe that the model achieves optimal performance at γ = 2.
Reflecting on Fig. 4, the enhanced data augmentation's impact on ASD-SWNet underscores our commitment to advancing ASD diagnostics through innovative machine learning techniques.By achieving a model accuracy of 76.52% and an AUC of 0.81 with an augmentation factor of 2, we not only address the challenge of data scarcity and imbalance but also enhance the model's robustness and generalizability.This advancement directly aligns with our core motivation: to develop a reliable, efficient tool for the early and accurate diagnosis of autism spectrum disorder.Our contributions are twofold: introducing a novel data augmentation strategy that significantly improves diagnostic performance, and, more broadly, pushing the boundaries of how machine learning can be applied in medical image processing to benefit clinical practices and patient outcomes.

Performance evaluation
Our proposed method, ASD-SWNet, was evaluated by comparing it against baseline and state-of-the-art (SOTA) 43 models on the ABIDE-I dataset.Unless specified, our experiments were carried out under the condition γ = 2 data augmentation.SVM 44 , HOFC 45 , and GCN 46 were used as baselines, while recent studies such as ASD- DiagNet 32 , ASD-SAENet 47 , and Hi-GCN 48 were considered SOTA models.It is noteworthy that to ensure consistency in experimental conditions, we used nested ten-fold cross-validation instead of the traditional approach.This method of cross-validation involved re-partitioning the dataset during model replication, with the newly partitioned test sets separated from the model training and feature selection processes.Nested cross-validation provides a more reliable and unbiased performance estimate for the model.
ASD-DiagNet implements a joint learning process utilizing autoencoders and single-layer perceptrons to enhance the quality of feature extraction and optimize model parameters.ASD-SAENet calculates functional connectivity between brain regions using the Pearson correlation coefficient and performs classification using sparse autoencoders.Hi-GCN, on the other hand, concurrently considers the internal structure of individual brain functional networks and the structure of the entire population network, effectively learning high-level embedding representations of brain networks.The Hi-GCN framework comprises two separate graph convolutional networks (GCNs), each designed to model individual brain functional networks and the overall population graph network, respectively.The graph-level embedding learning of individual brain functional networks employs a GCN named f-GCN, while the graph-level embedding learning for the entire population network utilizes another GCN, termed p-GCN.During the training of p-GCN, graph kernels are used to measure the similarity between brain networks.The joint learning approach of Hi-GCN effectively facilitates the learning of high-level embedding representations of brain networks.
Table 4 displays the classification results on the ABIDE-I dataset.It can be observed that the ASD-SWNet method achieved the highest performance with an accuracy of 76.52%, precision of 76.15%, recall of 80.65%, and an AUC of 0.81.The network with shared weights effectively combines unsupervised and supervised learning methods, endowing the model with enhanced classification capabilities.SVM is capable of extracting useful features from data for classification, yet it exhibits relatively poor performance.HOFC achieves a high performance with 71.13% accuracy and an AUC of 0.77.This indicates that, compared to FC, higher-order functional connectivity can more sensitively detect population differences and better capture individual variations.Our proposed method still manages to improve accuracy by 5.39% and AUC by 4%.The GCN approach uses non-imaging data of patients as a supplement, allowing the model to learn more meaningful features.However, due to the limitations of GCN, with shallow model layers unable to learn deeper features, it only achieves an accuracy of 70.58% and an AUC of 0.72.Among all SOTA methods, Hi-GCN achieved the best performance with an accuracy of 72.63% and an AUC of 0.79, followed by ASD-SAENet with an accuracy of 71.05% and an AUC of 0.73.The lowest-performing SOTA model was ASD-DiagNet, with only 69.07%accuracy and 0.71 AUC.ASD-DiagNet uses an AE and an SLP for feature extraction and classification, sharing losses between AE and SLP, which leads to lower model complexity and effectively reduces overfitting while enhancing model generalizability.ASD-SAENet also shares losses but differs from ASD-DiagNet by using a DNN as the classifier.The multi-layer feature representation of ASD-SAENet results in superior performance, indicating the significant impact of appropriate model architecture even in tasks with smaller datasets.Hi-GCN's two-level GCN effectively utilizes both imaging and non-imaging data, demonstrating the efficacy of joint learning in deriving structural brain region information and aggregating embeddings of adjacent topics to learn advanced brain network representations.Our proposed ASD-SWNet showed an improvement of 3.89% in accuracy and 2% in AUC.This highlights the effectiveness of combining unsupervised feature extraction in weight-shared networks with supervised classification methods, along with the crucial role of data augmentation algorithms in enhancing model performance.Additionally, we employed student's t-tests for statistical comparison between our method and others, setting the significance level α = 0.01 to ascertain statistical significance.As shown in Table 4, all p-values were less than 0.01.Rigorous statistical analysis confirms the significant advantages of our proposed method.
Table 4 illustrates that ASD-SWNet outperforms baseline and state-of-the-art models, achieving an accuracy of 76.52%, precision of 76.15%, recall of 80.65%, and an AUC of 0.81 on the ABIDE-I dataset.This marked improvement emphasizes the effectiveness of our shared-weight mechanism and the synergy between unsupervised feature extraction and supervised learning classification.Our method's superiority not only reaffirms the motivations behind this work-to address the pressing need for accurate, reliable ASD diagnostics through innovative machine learning techniques-but also highlights its significant contributions to the field.By enhancing the precision and reliability of ASD diagnosis, ASD-SWNet stands as a testament to the potential of advanced computational models in overcoming the limitations of current diagnostic methodologies, offering new avenues for early intervention and treatment planning.This analysis underscores the model's innovative approach to leveraging machine learning for significant clinical impact, marking a pivotal contribution to the ongoing efforts to improve ASD diagnostic practices.
To verify the robustness of ASD-SWNet, we employed LOOCV on the ABIDE-II dataset.We selected topical research from the last three years on ABIDE-II as comparative experiments.AL-NEGAT 49 is an adversarial learning-based node-edge graph attention network.It utilizes a novel attention-based adjacency matrix to perform graph convolution operations on these features.The model also adopts adversarial training methods to enhance its robustness and generalizability.STCAL 50 includes a sliding cluster attention (SCA) module and a guided co-attention (GCA) module.The SCA module is used to extract dynamic local feature representations, while the GCA module is employed for joint learning of spatio-temporal attention representations.Based on SCA and GCA, a co-attention learning network is further established to perform feature representation and fusion.Finally, classification tasks are realized by linking a simple attention-aware classification network.
Results from Fig. 5 demonstrate that ASD-SWNet outperforms all performance metrics on the ABIDE-II dataset.AL-NEGAT, which integrates node and edge features into graph classification tasks, showed improved classification performance with an accuracy of 69.91% and an AUC of 0.73.The adoption of adversarial training methods effectively enhanced the model's robustness and generalizability.This result underscores the effectiveness of graph network-based approaches for fusing multimodal information to define graph features, leveraging both structural and functional information.STCAL achieved an accuracy of 70.25% and an AUC of 0.74, attributable to the self-attention mechanism providing new insights for dynamically detecting key time frames in time-series fMRI data.STCAL dynamically captures crucial time frames and spatio-temporal correlations in time-series fMRI data, thus improving diagnostic accuracy for psychiatric disorders.ASD-SWNet, even without data augmentation ( γ = 1 ), already surpasses the other two SOTA methods.Compared to AL-NEGAT, ASD-SWNet ( γ = 1 ) shows a 2.89% increase in accuracy, highlighting the superiority of our proposed method With the application of data augmentation algorithms, ASD-SWNet reached an accuracy of 73.92%, a precision of 72.45%, a recall of 78.53%, and an AUC of 0.78.The data augmentation algorithm increased the sample size, addressed imbalanced data, effectively reduced the risk of overfitting, and enhanced the model's performance and robustness.

Leave-one-site-out cross validation
To verify the model's cross-site capability, we executed leave-one-site-out cross-validation 51 under the condition of no data augmentation.First, we grouped the 871 samples from the ABIDE-I dataset by site, obtaining a total of 17 sites.Secondly, for each site, we used its data as the validation set, while the data from the other sites were combined to form the training set.Thereafter, we trained the model and validated it at each site.This process was repeated until each site had been validated once.Table 5 reports the accuracy in comparison with other methods.We validated GCN 46 , ASD-DiagNet 32 , and ASD-SAENet 47 on the CC200 atlas.The results show that our method performed better in terms of accuracy in 12 out of the 17 sites.ASD-SWNet achieved an average accuracy of 72.40%, surpassing the comparative methods.Consistent with the findings of Wang et al. 52 , the accuracy rates at sites such as MAX_MUN, SDSU, STANFORD, TRINITY, and USM were all below 70%.This indicates the presence of heterogeneity not found in data from other sites.www.nature.com/scientificreports/ Our ASD-SWNet model showcases a superior cross-site capability, outperforming other models at the majority of sites.This achievement not only emphasizes the model's adaptability and effectiveness across different datasets but also marks a significant contribution towards enhancing the precision and reliability of ASD diagnostics.In summary, ASD-SWNet outperforms other SOTA methods at more sites, demonstrating the superior cross-site capability and robustness of the model we proposed.

Ablation study
Having verified the effectiveness of the data augmentation algorithm, we next designed ablation experiments 53 to validate other modules proposed in the study.Ablation experiments systematically dismantle various components of the model, facilitating a better verification of the ASD-SWNet's effects and allowing for an understanding of its internal mechanisms, constituent elements, and contributions.Table 6 presents the results of the ablation experiments.
CNN, with its inherent strong feature extraction capability, achieved an accuracy of 73.29%, an F1-score of 75.43%, and an AUC of 0.78 when solely responsible for feature extraction and classification.This indicates that the CNN designed in this study demonstrates robust classification ability for ASD data, validating the effectiveness of CNN.Building on this, we first utilized DAE to extract low-dimensional features from the data, then input these preliminarily extracted features into our designed CNN for further feature extraction and completion of the classification task.After processing by DAE, the model's accuracy improved by 1.57%, the F1-score by 0.92%, and the AUC by 1%, indicating the beneficial impact of DAE's low-dimensional feature processing on the model's performance.Furthermore, sharing weights between DAE and CNN during training, as seen in the results from Table 6, led to a 1.88% increase in accuracy, a 1.81% increase in F1-score, and an approximate 2% increase in AUC compared to methods without weight sharing.This suggests that the weight-sharing approach aids the model in effectively identifying more meaningful features, thereby enhancing performance.AE can learn effective feature representations of data, which CNN can leverage.Additionally, AE helps in learning redundant information in the data, enabling CNN to reduce redundant feature representations.Finally, AE, through nonlinear mapping, learns complex relationships in data, which CNN can utilize to better handle image data.In summary, in our designed weight-shared model, CNN benefits from the feature representations learned by AE, eliminating the need to learn these features from scratch, and thereby improving model performance.Ablation experiments indicate that each component of our proposed method enhances the model's performance and contributes to its robustness.
This systematic component evaluation aligns perfectly with our motivation to develop a robust, accurate diagnostic tool for ASD that leverages the full potential of machine learning.The ablation study not only demonstrates the individual contributions of CNN, DAE, and shared weights but also collectively emphasizes the innovative approach of integrating these components to improve diagnostic accuracy.This methodological advancement contributes significantly to the field of ASD diagnosis, showcasing a novel, effective way to address the challenges of ASD classification using machine learning techniques.

Visualization
We employed the t-SNE technique to visualize both the original data and the results post-classification by ASD-SWNet in a 2D space.This was done to observe the performance of our proposed method in terms of feature fusion and classification.The visualization results are illustrated in Fig. 6.As shown in Fig. 6a, the original data distribution of ASD and HC is random and disordered, with no clear boundary between the two categories, indicating that classification is highly challenging.After classification by ASD-SWNet, the two-dimensional feature visualization of the classification results is presented in Fig. 6b.Here, the blue nodes are clustered together, showing a very tight distribution.The orange nodes display a similar distribution pattern.This suggests that the features post-ASD-SWNet classification exhibit good intra-class cohesion.Furthermore, the subgraphs corresponding to the blue and orange nodes are overall similar, indicating that the method distinctly separates ASD patients from the healthy control group, which means the classified features have notable inter-class discriminability.The data visualization results thoroughly demonstrate the exceptional classification capability of our proposed ASD-SWNet model.

Conclusion
This study proposes an autism spectrum disorder diagnostic method based on a shared weights network.It incorporates a data augmentation algorithm, an unsupervised learning feature extraction method, and a supervised learning convolutional neural network classification approach.The data augmentation algorithm addresses the issues of insufficient data leading to inadequate model training and overfitting.Utilizing an AE to extract lowdimensional features from the original data and performing preprocessing, these features are then input into the CNN to further extract high-dimensional features for classification tasks.Notably, this study achieves weight sharing between AE and CNN, linking the entire model together, enabling the CNN to extract more meaningful features, enhancing the performance of the classification model, and effectively combining unsupervised and supervised learning.Compared to existing methods, our proposed approach achieves the highest performance in identifying ASD and offers a broad prospect for clinical application.Lastly, this research also has certain limitations; we did not consider the supplementary role of non-imaging data, such as patient age, gender, and family history, as would be queried by a clinician.In the future, we plan to use graph neural networks to combine non-imaging data to explore its impact on model performance.Graph edges can effectively encode associations between nodes, offering more possibilities for multimodal ASD diagnosis.

Figure 2 .
Figure 2. Supervised learning classification convolutional neural network model framework.

Figure 3 .
Figure 3.The proposed overall network framework of ASD-SWNet.

Figure 4 .
Figure 4. Effects of data augmentation on model performance.

Figure 5 .
Figure 5. Results of different models on ABIDE-II.

Figure 6 .
Figure 6.2D space visualization results.Green nodes denote autism spectrum disorder and red nodes denote healthy controls.(a) 2D Space visualization of original features; (b) 2D Space visualization following ASD-SWNet features.

Table 1 .
Dataset and preprocessing information.

Table 4 .
Performance of our method and existing models.Significant values are given in bold.
combining unsupervised feature extraction in a weight-shared network with supervised CNN classification.

Table 5 .
Performance of our method and existing models.Significant values are given in bold.

Table 6 .
Results of ablation experiment.